@lionpeloux
Contributor

@lionpeloux lionpeloux commented Sep 22, 2025

From #2929

We ensure that auto-generated output tool names are properly sanitized so they conform to [a-z0-9-_].
Any character outside this pattern is simply dropped from the final name.

The error was first noticed when using generic classes as output_type, where brackets in the tool name would be rejected by the provider API.
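
To make the failure mode concrete, here is a minimal sketch of the kind of name check a provider might apply. The allowed character set and the helper name `is_valid_tool_name` are assumptions for illustration, not any provider's documented rule, and the bracketed example name is inferred from the description above.

```python
import re

# Assumed constraint, for illustration only: letters, digits, underscores, hyphens.
# Real providers document their own limits.
ALLOWED_NAME = re.compile(r'[a-zA-Z0-9_-]+')


def is_valid_tool_name(name: str) -> bool:
    """Return True if every character of `name` is in the assumed allowed set."""
    return ALLOWED_NAME.fullmatch(name) is not None


# A name derived from a generic type carries brackets and fails the check.
print(is_valid_tool_name('final_result_Result[StringData]'))  # False
# The same name with the brackets dropped passes.
print(is_valid_tool_name('final_result_ResultStringData'))  # True
```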

@lionpeloux lionpeloux changed the title add sanitization to auto-generated output tool name Add sanitization to auto-generated output tool name Sep 22, 2025
@lionpeloux lionpeloux marked this pull request as ready for review September 22, 2025 10:40
@DouweM
Collaborator

DouweM commented Sep 26, 2025

@lionpeloux Thanks Lionel! Can you please add a test?

@github-actions

github-actions bot commented Oct 4, 2025

This PR is stale, and will be closed in 3 days if no reply is received.

@github-actions github-actions bot added the Stale label Oct 4, 2025
@lionpeloux
Contributor Author

I’ll get back to it next week hopefully.

@github-actions github-actions bot removed the Stale label Oct 5, 2025
@github-actions

This PR is stale, and will be closed in 3 days if no reply is received.

@github-actions github-actions bot added the Stale label Oct 12, 2025
@DouweM DouweM removed the Stale label Oct 13, 2025
@lionpeloux
Contributor Author

Fix Applied

I've identified and fixed the bug in the tool name sanitization regex.

The Issue

The original regex pattern [^a-z0-9-_] was only allowing lowercase letters, which caused it to remove uppercase letters from class names. For example:

  • Foo → oo (the F was stripped)
  • Bar → ar (the B was stripped)

This caused all the test failures you were seeing.

The Fix

Changed the regex pattern to [^a-zA-Z0-9-_] to preserve both uppercase and lowercase letters while still removing invalid characters like brackets from generic type names.
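
The difference between the two patterns can be sketched as follows. The two regexes are taken from this thread; the `sanitize` helper is a hypothetical stand-in, not the code in `_output.py`.

```python
import re

# Patterns quoted in this PR: each matches the characters to remove.
OLD_PATTERN = re.compile(r'[^a-z0-9-_]')     # lowercase only (buggy)
NEW_PATTERN = re.compile(r'[^a-zA-Z0-9-_]')  # upper- and lowercase


def sanitize(name: str, pattern: re.Pattern[str]) -> str:
    """Drop every character that the given pattern matches."""
    return pattern.sub('', name)


print(sanitize('Foo', OLD_PATTERN))                 # oo
print(sanitize('Foo', NEW_PATTERN))                 # Foo
print(sanitize('Result[StringData]', OLD_PATTERN))  # esulttringata
print(sanitize('Result[StringData]', NEW_PATTERN))  # ResultStringData
```

Both patterns remove the brackets; only the second keeps the class name readable.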

Testing

The PR should now pass all CI checks!

DouweM and others added 4 commits October 14, 2025 10:12
The previous regex pattern `[^a-z0-9-_]` was removing uppercase letters
from class names, causing test failures. For example, "Foo" became "oo".

This fix changes the pattern to `[^a-zA-Z0-9-_]` to preserve both
uppercase and lowercase letters while still removing invalid characters
like brackets from generic type names (e.g., `Result[StringData]`).

Also adds a test case to verify that generic class names with brackets
are properly sanitized while preserving valid characters.

Fixes test failures in test_response_multiple_return_tools and related tests.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@lionpeloux lionpeloux force-pushed the pr-tool-output-naming branch from 74d15fb to b445f0b Compare October 14, 2025 08:12
@lionpeloux
Contributor Author

PR Branch Updated with Latest Main

I've successfully rebased the PR branch onto the latest main branch. The branch is now up to date with all recent changes, including:

  • 48 commits from main since the original PR was created
  • Major updates including image generation support, OpenTelemetry instrumentation v3, MCP improvements, and more

What Changed

The PR now includes:

  1. ✅ The original tool name sanitization feature
  2. ✅ The bug fix for uppercase letter preservation in class names
  3. ✅ A comprehensive test case for generic type sanitization
  4. ✅ All the latest changes from main

Testing Status

All tests pass locally after the rebase, confirming that the fix is compatible with the latest codebase changes.

The CI should now run against the updated branch and all tests should pass! 🚀

Resolved conflicts by:
- Accepting all incoming changes from main
- Preserving the tool name sanitization fix in _output.py
- Re-adding the test_output_type_generic_class_name_sanitization test

The PR now includes:
- Tool name sanitization with uppercase letter support [^a-zA-Z0-9-_]
- Comprehensive test for generic type bracket removal
- All latest changes from main (Prefect support, image generation, etc.)
@lionpeloux
Contributor Author

@DouweM I've added the test. Let me know if something goes wrong.

Comment on lines 666 to 674
assert len(m.last_model_request_parameters.output_tools) == 2

# Check that tool names don't contain brackets
tool_names = [tool.name for tool in m.last_model_request_parameters.output_tools]
for tool_name in tool_names:
    assert '[' not in tool_name, f"Tool name '{tool_name}' contains brackets"
    assert ']' not in tool_name, f"Tool name '{tool_name}' contains brackets"
    # Verify the name follows the pattern [a-zA-Z0-9_-]
    assert re.match(r'^[a-zA-Z0-9_-]+$', tool_name), f"Tool name '{tool_name}' contains invalid characters"

Collaborator

I'd like to add this (the snapshot will be filled in automatically when you run the test) so that we see the names we ended up with.

Suggested change
assert len(m.last_model_request_parameters.output_tools) == 2
# Check that tool names don't contain brackets
tool_names = [tool.name for tool in m.last_model_request_parameters.output_tools]
for tool_name in tool_names:
    assert '[' not in tool_name, f"Tool name '{tool_name}' contains brackets"
    assert ']' not in tool_name, f"Tool name '{tool_name}' contains brackets"
    # Verify the name follows the pattern [a-zA-Z0-9_-]
    assert re.match(r'^[a-zA-Z0-9_-]+$', tool_name), f"Tool name '{tool_name}' contains invalid characters"
tool_names = [tool.name for tool in m.last_model_request_parameters.output_tools]
assert tool_names == snapshot()
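
One way to see why comparing against the exact names is the stronger check: a pattern-only assertion would also pass for names mangled by the earlier lowercase-only regex. The sketch below uses plain list equality in place of `snapshot()` to stay dependency-free. The name lists are illustrative values, and `passes_pattern_check` is a hypothetical helper.

```python
import re

NAME_PATTERN = re.compile(r'^[a-zA-Z0-9_-]+$')


def passes_pattern_check(names: list[str]) -> bool:
    """Mimic the per-name regex assertion from the original test."""
    return all(NAME_PATTERN.match(name) is not None for name in names)


# Illustrative values: names we would want, and names as the
# lowercase-only pattern would have produced them.
expected = ['final_result_ResultStringData', 'final_result_Resultint']
mangled = ['final_result_esulttringata', 'final_result_esultint']

print(passes_pattern_check(mangled))  # True: the pattern check misses the bug
print(mangled == expected)            # False: an exact comparison catches it
```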

@github-actions

This PR is stale, and will be closed in 3 days if no reply is received.

@github-actions github-actions bot added the Stale label Oct 22, 2025
From pydantic#2929

We ensure that auto-generated tool output names are properly sanitized
so they conform to `[a-zA-Z0-9-_]`. All characters not in this pattern
will simply be ignored/skipped in the final name.

The error was first noticed when using generic classes as `output_type`,
where brackets in the tool name would be rejected by the provider API.

Added snapshot test to verify the sanitized tool names are generated
correctly from generic types like Result[StringData] and Result[int].

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@lionpeloux lionpeloux force-pushed the pr-tool-output-naming branch from 30e9921 to a854960 Compare October 23, 2025 04:37
@lionpeloux
Contributor Author

Hi @DouweM, I've merged your suggestion. Thanks for your review.

@github-actions github-actions bot removed the Stale label Oct 23, 2025

def test_output_type_generic_class_name_sanitization(create_module: Callable[[str], Any]):
    """Test that generic class names with brackets are properly sanitized."""
    module_code = '''

Collaborator

We can inline this as regular code -- we're only using a stringified module above because we're injecting code {union_code}.

@DouweM DouweM changed the title Add sanitization to auto-generated output tool name Sanitize auto-generated output tool name to support generic types Oct 23, 2025
As suggested by @DouweM, the test now defines classes directly inline
instead of using the create_module helper with a string, since we're
not dynamically generating code.

Note: The snapshot changed because inline classes have their function
scope in the qualified name, resulting in a longer sanitized tool name.
This still correctly tests bracket sanitization.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@lionpeloux
Contributor Author

@DouweM I've inlined the code as you suggested.

Note about the snapshot change: The tool name changed from final_result_ResultStringData to final_result_Resulttest_output_type_generic_class_name_sanitizationlocalsStringData because when classes are defined inline within a function, their fully qualified name includes the function scope (e.g., test_output_type_generic_class_name_sanitization.<locals>.StringData).

The sanitizer processes this qualified name and strips out the dots and special characters, resulting in the longer name. This is correct behavior and still properly tests that brackets from generic types like Result[StringData] are sanitized.
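
The scope effect described above can be reproduced with the standard library alone. This is a sketch under the assumption that the name being sanitized is built from `__qualname__`; the function and class names are hypothetical.

```python
import re

INVALID_CHARS = re.compile(r'[^a-zA-Z0-9-_]')


def make_local_class() -> type:
    # A class defined inside a function gets the enclosing scope in its __qualname__.
    class StringData:
        pass

    return StringData


local_cls = make_local_class()
print(local_cls.__qualname__)
# make_local_class.<locals>.StringData
print(INVALID_CHARS.sub('', local_cls.__qualname__))
# make_local_classlocalsStringData
```

The dots and angle brackets are dropped, but the enclosing function name and the word `locals` survive, which is why the snapshot name grew.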

Let me know if you'd prefer to keep the create_module approach to have cleaner tool names in the snapshot, or if this is acceptable.

@lionpeloux lionpeloux requested a review from DouweM October 23, 2025 14:37
@DouweM
Collaborator

DouweM commented Oct 23, 2025

@lionpeloux Ah yeah that's a bit confusing. I'm OK with defining the classes at the top level of the file, I'd rather have that than a string module.

@lionpeloux
Contributor Author

@DouweM Done! I've moved the classes to module level as you suggested.

Changes:

  • Added Generic and TypeVar to the typing imports
  • Defined ResultGeneric and StringData classes at module level (similar to the Person class pattern elsewhere in the file)
  • Simplified the test function to use these module-level classes
  • Tool names are now clean: final_result_ResultGenericStringData and final_result_ResultGenericint

This gives us the best of both worlds - readable inline code (no string modules) and clean snapshot names (no function scope pollution). Test passes successfully!
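
For readers following along, the resulting names can be approximated with standard-library typing only. This is a rough sketch: the class bodies, `display_name`, and `output_tool_name` are hypothetical stand-ins, and the real test classes and naming logic in the library may differ.

```python
import re
from typing import Any, Generic, TypeVar, get_args, get_origin

T = TypeVar('T')


class StringData:
    """Stand-in payload type (hypothetical)."""


class ResultGeneric(Generic[T]):
    """Stand-in generic wrapper (hypothetical)."""


def display_name(tp: Any) -> str:
    """Build a bracketed name such as 'ResultGeneric[StringData]'."""
    origin = get_origin(tp)
    if origin is None:
        return tp.__name__
    args = ', '.join(display_name(arg) for arg in get_args(tp))
    return f'{origin.__name__}[{args}]'


def output_tool_name(tp: Any, prefix: str = 'final_result') -> str:
    """Join prefix and type name, then drop characters outside [a-zA-Z0-9-_]."""
    return re.sub(r'[^a-zA-Z0-9-_]', '', f'{prefix}_{display_name(tp)}')


print(output_tool_name(ResultGeneric[StringData]))  # final_result_ResultGenericStringData
print(output_tool_name(ResultGeneric[int]))         # final_result_ResultGenericint
```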

@DouweM
Collaborator

DouweM commented Oct 24, 2025

@lionpeloux Don't forget to push :)

As suggested by @DouweM, moved ResultGeneric and StringData classes
to the top of the test file (after Person class) instead of defining
them inline within the test function. This approach:

- Avoids the string module approach (create_module)
- Provides clean tool names in snapshots without function scope
- Follows the existing pattern in the test file

Tool names are now clean: final_result_ResultGenericStringData and
final_result_ResultGenericint.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@lionpeloux
Contributor Author

@DouweM Classes are now at the TOP of the file!

I moved ResultGeneric and StringData to lines 100-112 (right after the Person class), with a descriptive comment explaining their purpose for testing tool name sanitization with generic types.

The test passes successfully with clean tool names in the snapshot.

@DouweM DouweM merged commit efa1e26 into pydantic:main Oct 24, 2025
31 checks passed
@DouweM
Collaborator

DouweM commented Oct 24, 2025

@lionpeloux Thanks Lionel!
